All Questions

0 votes
1 answer
657 views

Why does Adam outperform SGD in logistic regression?

I am training a logistic regression model. In case it matters, the features are 1376-dimensional embeddings output from a neural network. I tried both SGD and Adam with a learning rate of $10^{-3}$ ...
nalzok
1 vote
0 answers
27 views

How to balance time/effort with transformations, feature selection, and model efficacy in NLP? [closed]

Edit: Question has been edited for reopening (see comment section for justification) Being new to text analytics, I haven't gotten the hang of navigating a typical workflow given the longer times ...
Josh
2 votes
1 answer
147 views

How can we conclude that an optimization algorithm is better than another one?

When we test a new optimization algorithm, what process do we need to follow? For example, do we need to run the algorithm several times and pick the best performance, i.e., in terms of accuracy, f1 ...
user82620
1 vote
0 answers
99 views

Is there a machine learning framework that supports partial evaluation, i.e., can return a function?

Is there a machine learning framework that supports partial evaluation? For example: We train on [model, year, km, ..., colour, price]. Today we call ...
Adam Bittlingmayer
1 vote
1 answer
246 views

Optimal parameter estimation for a classifier with multiple parameters

The image on the left shows a standard ROC curve formed by sweeping a single threshold and recording the corresponding True Positive Rate (TPR) and False Positive Rate (FPR). The image on the right ...
Rahul Murmuria